{ "cells": [ { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation:\n", "\n", "To create the environment with the required tools:\n", "\n", "`conda create --name textrepr python=3 scikit-learn numpy scipy gensim nltk pandas jupyter ipython`\n", "\n", "To activate the environment:\n", "\n", "`source activate textrepr`\n", "\n", "Then install the `powerlaw` package with pip:\n", "\n", "`pip install powerlaw`\n", "\n", "To start the programming environment, run the following in the same directory as the file Text Representations.ipynb:\n", "\n", "`jupyter notebook`\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import sklearn.feature_extraction.text as txtfeats\n", "import powerlaw\n", "from nltk import skipgrams" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", " | Unnamed: 0 | \n", "date | \n", "time | \n", "screen_name | \n", "text | \n", "
---|---|---|---|---|---|
0 | \n", "0 | \n", "2016-04-15 | \n", "00:51:27 | \n", "TierroDavid | \n", "rt revistaepoca dilma diz que se resistir ao i... | \n", "
1 | \n", "1 | \n", "2016-04-15 | \n", "00:52:20 | \n", "TierroDavid | \n", "rt folha grupos pro e contra impeachment convo... | \n", "
2 | \n", "2 | \n", "2016-04-15 | \n", "00:54:06 | \n", "TierroDavid | \n", "estadao estadao irresponsavel patrocinando o g... | \n", "
3 | \n", "3 | \n", "2016-04-15 | \n", "00:11:32 | \n", "lisadoflop | \n", "lembrei de alguem com essa foto httpstcoxjwnbo... | \n", "
4 | \n", "4 | \n", "2016-04-15 | \n", "01:27:48 | \n", "TierroDavid | \n", "rt folha ministros sao exonerados para votarem... | \n", "